AITopics | problem file

Unifying Inference-Time Planning Language Generation

Kagitha, Prabhu Prakash, Sun, Bo, Desai, Ishan, Zhu, Andrew, Huang, Cassie, Li, Manling, Li, Ziyang, Zhang, Li

arXiv.org Artificial Intelligence, Oct-8-2025

A line of work in planning uses LLMs not to generate a plan, but to generate a formal representation in some planning language, which can be input into a symbolic solver to deterministically find a plan. While showing improved trust and promising performance, dozens of recent publications have proposed scattered methods on a variety of benchmarks under different experimental settings. We attempt to unify the inference-time LLM-as-formalizer methodology for classical planning by proposing a unifying framework based on intermediate representations. We thus systematically evaluate more than a dozen pipelines that subsume most existing work, while proposing novel ones that involve syntactically similar but high-resource intermediate languages (such as a Python wrapper of PDDL). We provide recipes for planning language generation pipelines, draw a series of conclusions showing the efficacy of their various components, and evidence their robustness against problem complexity.
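To make the idea of a "Python wrapper of PDDL" as an intermediate representation more concrete, here is a minimal sketch of what such a wrapper might look like. It is not the wrapper used in the paper: the `Problem` class, its field names, and the blocksworld predicates are all hypothetical and chosen only for illustration. The point is that a model could emit ordinary Python objects, which are then rendered deterministically into a PDDL problem file.

```python
from dataclasses import dataclass, field
from typing import List, Tuple

Atom = Tuple[str, ...]  # e.g. ("on", "b", "a") renders as "(on b a)"


@dataclass
class Problem:
    """Hypothetical Python stand-in for a PDDL problem file."""
    name: str
    domain: str
    objects: List[str] = field(default_factory=list)
    init: List[Atom] = field(default_factory=list)
    goal: List[Atom] = field(default_factory=list)

    def to_pddl(self) -> str:
        """Render the problem as PDDL text (untyped objects, conjunctive goal)."""
        def atom(parts: Atom) -> str:
            return "(" + " ".join(parts) + ")"

        lines = [
            f"(define (problem {self.name})",
            f"  (:domain {self.domain})",
            f"  (:objects {' '.join(self.objects)})",
            f"  (:init {' '.join(atom(a) for a in self.init)})",
            f"  (:goal (and {' '.join(atom(a) for a in self.goal)}))",
            ")",
        ]
        return "\n".join(lines)


# Illustrative instance: block b sits on block a, and the goal is a on b.
problem = Problem(
    name="swap",
    domain="blocksworld",
    objects=["a", "b"],
    init=[("on-table", "a"), ("on", "b", "a"), ("clear", "b"), ("arm-empty",)],
    goal=[("on", "a", "b")],
)
pddl_text = problem.to_pddl()
print(pddl_text)
```

The rendered text could then be passed to any symbolic planner together with a matching domain file; the sketch covers only the problem file and does not handle types, numeric fluents, or non-conjunctive goals.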

logic & formal reasoning, natural language, pddl, (20 more...)

arXiv.org Artificial Intelligence

2505.14763

Country:

North America > United States (1.00)
Europe > Austria > Vienna (0.14)

Genre: Research Report > New Finding (0.34)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Planning & Scheduling (1.00)
Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Logic & Formal Reasoning (0.93)

PSALM-V: Automating Symbolic Planning in Interactive Visual Environments with Large Language Models

Zhu, Wang Bill, Chai, Miaosen, Singh, Ishika, Jia, Robin, Thomason, Jesse

arXiv.org Artificial Intelligence, Jun-26-2025

We propose PSALM-V, the first autonomous neuro-symbolic learning system able to induce symbolic action semantics (i.e., pre- and post-conditions) in visual environments through interaction. PSALM-V bootstraps reliable symbolic planning without expert action definitions, using LLMs to generate heuristic plans and candidate symbolic semantics. Previous work has explored using large language models to generate action semantics for Planning Domain Definition Language (PDDL)-based symbolic planners. However, these approaches have primarily focused on text-based domains or relied on unrealistic assumptions, such as access to a predefined problem file, full observability, or explicit error messages. By contrast, PSALM-V dynamically infers PDDL problem files and domain action semantics by analyzing execution outcomes and synthesizing possible error explanations. The system iteratively generates and executes plans while maintaining a tree-structured belief over possible action semantics for each action, iteratively refining these beliefs until a goal state is reached. Simulated experiments of task completion in ALFRED demonstrate that PSALM-V increases the plan success rate from 37% (Claude-3.7) to 74% in partially observed setups. Results on two 2D game environments, RTFM and Overcooked-AI, show that PSALM-V improves step efficiency and succeeds in domain induction in multi-agent settings. PSALM-V correctly induces PDDL pre- and post-conditions for real-world robot BlocksWorld tasks, despite low-level manipulation failures from the robot.

large language model, machine learning, natural language, (20 more...)

arXiv.org Artificial Intelligence

2506.20097

Country: North America > Mexico (0.28)

Genre: Research Report > New Finding (0.46)

Industry: Leisure & Entertainment > Games > Computer Games (0.48)

Technology:

Information Technology > Artificial Intelligence > Robots (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Planning & Scheduling (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.30)

Zero-Shot Iterative Formalization and Planning in Partially Observable Environments

Gong, Liancheng, Zhu, Wang, Thomason, Jesse, Zhang, Li

arXiv.org Artificial Intelligence, May-21-2025

Using LLMs not to predict plans but to formalize an environment into the Planning Domain Definition Language (PDDL) has been shown to improve performance and control. Existing work focuses on fully observable environments; we tackle the more realistic and challenging partially observable environments that lack complete, reliable information. We propose PDDLego+, a framework to iteratively formalize, plan, grow, and refine PDDL representations in a zero-shot manner, without needing access to any existing trajectories. On two textual simulated environments, we show that PDDLego+ improves goal-reaching success and exhibits robustness against problem complexity. We also show that the domain knowledge captured after a successful trial can benefit future tasks.

large language model, machine learning, receptacle, (20 more...)

arXiv.org Artificial Intelligence

2505.13126

Country:

North America > Mexico (0.28)
North America > United States (0.28)

Genre: Research Report (0.64)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.69)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.46)

Planetarium: A Rigorous Benchmark for Translating Text to Structured Planning Languages

Zuo, Max, Velez, Francisco Piedrahita, Li, Xiaochen, Littman, Michael L., Bach, Stephen H.

arXiv.org Artificial Intelligence, Jul-3-2024

Many recent works have explored using language models for planning problems. One line of research focuses on translating natural language descriptions of planning tasks into structured planning languages, such as the Planning Domain Definition Language (PDDL). While this approach is promising, accurately measuring the quality of generated PDDL code continues to pose significant challenges. First, generated PDDL code is typically evaluated using planning validators that check whether the problem can be solved with a planner. This method is insufficient because a language model might generate valid PDDL code that does not align with the natural language description of the task. Second, existing evaluation sets often have natural language descriptions of the planning task that closely resemble the ground truth PDDL, reducing the challenge of the task. To bridge this gap, we introduce Planetarium, a benchmark designed to evaluate language models' ability to generate PDDL code from natural language descriptions of planning tasks. We begin by creating a PDDL equivalence algorithm that rigorously evaluates the correctness of PDDL code generated by language models by flexibly comparing it against a ground truth PDDL. Then, we present a dataset of 132,037 text-to-PDDL pairs across 13 different tasks, with varying levels of difficulty. Finally, we evaluate several API-access and open-weight language models that reveal this task's complexity. For example, 87.6% of the PDDL problem descriptions generated by GPT-4o are syntactically parseable, 82.2% are valid, solvable problems, but only 35.1% are semantically correct, highlighting the need for a more rigorous benchmark for this problem.

graph, planning problem, proposition, (16 more...)

arXiv.org Artificial Intelligence

2407.03321

Country:

North America > United States (0.04)
South America > Colombia > Meta Department > Villavicencio (0.04)
Europe > United Kingdom > England > Oxfordshire > Oxford (0.04)
(2 more...)

Genre: Research Report (1.00)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Planning & Scheduling (1.00)
Information Technology > Artificial Intelligence > Natural Language (1.00)

PDDLEGO: Iterative Planning in Textual Environments

Zhang, Li, Jansen, Peter, Zhang, Tianyi, Clark, Peter, Callison-Burch, Chris, Tandon, Niket

arXiv.org Artificial Intelligence, May-30-2024

Planning in textual environments has been shown to be a long-standing challenge even for current models. A recent, promising line of work uses LLMs to generate a formal representation of the environment that can be solved by a symbolic planner. However, existing methods rely on a fully-observed environment where all entity states are initially known, so a one-off representation can be constructed, leading to a complete plan. In contrast, we tackle partially-observed environments where there is initially not enough information to plan for the end-goal. We propose PDDLEGO, which iteratively constructs a planning representation that can lead to a partial plan for a given sub-goal. By accomplishing the sub-goal, more information is acquired to augment the representation, eventually achieving the end-goal. We show that plans produced by few-shot PDDLEGO are 43% more efficient than generating plans end-to-end on the Coin Collector simulation, with strong performance (98%) on the more complex Cooking World simulation where end-to-end LLMs fail to generate coherent plans (4%).

domain file, ingredient, problem file, (15 more...)

arXiv.org Artificial Intelligence

2405.19793

Country:

North America > United States > Pennsylvania (0.04)
North America > United States > Arizona (0.04)
Europe > United Kingdom > Scotland > City of Aberdeen > Aberdeen (0.04)

Genre: Research Report (0.40)

Industry: Government (0.47)

Technology: Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)

DELTA: Decomposed Efficient Long-Term Robot Task Planning using Large Language Models

Liu, Yuchen, Palmieri, Luigi, Koch, Sebastian, Georgievski, Ilche, Aiello, Marco

arXiv.org Artificial Intelligence, Apr-4-2024

Recent advancements in Large Language Models (LLMs) have sparked a revolution across various research fields. In particular, the integration of common-sense knowledge from LLMs into robot task and motion planning has been proven to be a game-changer, elevating performance in terms of explainability and downstream task efficiency to unprecedented heights. However, managing the vast knowledge encapsulated within these large models has posed challenges, often resulting in infeasible plans generated by LLM-based planning systems due to hallucinations or missing domain information. To overcome these challenges and obtain even greater planning feasibility and computational efficiency, we propose a novel LLM-driven task planning approach called DELTA. For achieving better grounding from environmental topology into actionable knowledge, DELTA leverages the power of scene graphs as environment representations within LLMs, enabling the fast generation of precise planning problem descriptions. For obtaining higher planning performance, we use LLMs to decompose the long-term task goals into an autoregressive sequence of sub-goals for an automated task planner to solve. Our contribution enables a more efficient and fully automatic task planning pipeline, achieving higher planning success rates and significantly shorter planning times compared to the state of the art.

delta, llm, scene graph, (15 more...)

arXiv.org Artificial Intelligence

2404.03275

Country: Europe > Germany > Baden-Württemberg > Stuttgart Region > Stuttgart (0.04)

Genre: Research Report (0.40)

Technology:

Information Technology > Artificial Intelligence > Robots (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)

ISR-LLM: Iterative Self-Refined Large Language Model for Long-Horizon Sequential Task Planning

Zhou, Zhehua, Song, Jiayang, Yao, Kunpeng, Shu, Zhan, Ma, Lei

arXiv.org Artificial Intelligence, Aug-25-2023

Motivated by the substantial achievements observed in Large Language Models (LLMs) in the field of natural language processing, recent research has commenced investigations into the application of LLMs for complex, long-horizon sequential task planning challenges in robotics. LLMs are advantageous in offering the potential to enhance the generalizability as task-agnostic planners and facilitate flexible interaction between human instructors and planning systems. However, task plans generated by LLMs often lack feasibility and correctness. To address this challenge, we introduce ISR-LLM, a novel framework that improves LLM-based planning through an iterative self-refinement process. The framework operates through three sequential steps: preprocessing, planning, and iterative self-refinement. During preprocessing, an LLM translator is employed to convert natural language input into a Planning Domain Definition Language (PDDL) formulation. In the planning phase, an LLM planner formulates an initial plan, which is then assessed and refined in the iterative self-refinement step by using a validator. We examine the performance of ISR-LLM across three distinct planning domains. The results show that ISR-LLM is able to achieve markedly higher success rates in task accomplishments compared to state-of-the-art LLM-based planners. Moreover, it also preserves the broad applicability and generalizability of working with natural language instructions.

large language model, natural language, precondition, (19 more...)

arXiv.org Artificial Intelligence

2308.13724

Country:

North America > Canada > Alberta (0.14)
North America > United States > Massachusetts (0.04)
Europe > Switzerland > Vaud > Lausanne (0.04)
Asia > Japan > Honshū > Kantō > Tokyo Metropolis Prefecture > Tokyo (0.04)

Genre: Research Report > New Finding (0.34)

Technology: Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)

Concise QBF Encodings for Games on a Grid (extended version)

Shaik, Irfansha, van de Pol, Jaco

arXiv.org Artificial Intelligence, Mar-29-2023

Encoding 2-player games in QBF correctly and efficiently is challenging and error-prone. To enable concise specifications and uniform encodings of games played on grid boards, like Tic-Tac-Toe, Connect-4, Domineering, Pursuer-Evader and Breakthrough, we introduce BDDL - Board-game Domain Definition Language, inspired by the success of PDDL in the planning domain. We provide an efficient translation from BDDL into QBF, encoding the existence of a winning strategy of bounded depth. Our lifted encoding treats board positions symbolically and allows concise definitions of conditions, effects and winning configurations, relative to symbolic board positions. The size of the encoding grows linearly in the input model and the considered depth. To show the feasibility of such a generic approach, we use QBF solvers to compute the critical depths of winning strategies for instances of several known games. For several games, our work provides the first QBF encoding. Unlike plan validation in SAT-based planning, validating QBF-based winning strategies is difficult. We show how to validate winning strategies using QBF certificates and interactive game play.

artificial intelligence, planning & scheduling, precondition, (13 more...)

arXiv.org Artificial Intelligence

2303.16949

Country:

Europe > Denmark (0.04)
Europe > Germany > Saarland (0.04)
Asia > Middle East > Jordan (0.04)

Genre: Research Report (0.82)

Industry: Leisure & Entertainment > Games > Computer Games (0.34)

Technology: Information Technology > Artificial Intelligence > Representation & Reasoning > Planning & Scheduling (0.66)

Automated Control and Simulation of Dynamic Robot Teams in the Domain of CFK Production

Glück, Roland, Körber, Marian

arXiv.org Artificial Intelligence, Oct-20-2022

This paper is concerned with the automation and simulation of pick-and-place processes in the domain of CFK aircraft production. We introduce a workflow that starts from a CAD construction, extracts the relevant data from it, assigns grippers to the CFK pieces, and schedules the individual steps using a PDDL solver. Finally, the result is visualized in Blender, where earlier mistakes can also be identified.

artificial intelligence, planning & scheduling, ply, (16 more...)

arXiv.org Artificial Intelligence

2210.11213

Country: Europe > Portugal > Porto > Porto (0.04)

Genre:

Research Report (0.50)
Workflow (0.49)

Industry: Leisure & Entertainment > Games (0.51)

Technology:

Information Technology > Artificial Intelligence > Robots (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Planning & Scheduling (1.00)

Task Scoping: Building Goal-Specific Abstractions for Planning in Complex Domains

Kumar, Nishanth, Fishman, Michael, Danas, Natasha, Littman, Michael, Tellex, Stefanie, Konidaris, George

arXiv.org Artificial Intelligence, Oct-17-2020

A generally intelligent agent requires an open-scope world model: one rich enough to tackle any of the wide range of tasks it may be asked to solve over its operational lifetime. Unfortunately, planning to solve any specific task using such a rich model is computationally intractable - even for state-of-the-art methods - due to the many states and actions that are necessarily present in the model but irrelevant to that problem. We propose task scoping: a method that exploits knowledge of the initial condition, goal condition, and transition-dynamics structure of a task to automatically and efficiently prune provably irrelevant factors and actions from a planning problem, which can dramatically decrease planning time. We prove that task scoping never deletes relevant factors or actions, characterize its computational complexity, and characterize the planning problems for which it is especially useful. Finally, we empirically evaluate task scoping on a variety of domains and demonstrate that using it as a pre-planning step can reduce the state-action space of various planning problems by orders of magnitude and speed up planning. When applied to a complex Minecraft domain, our approach speeds up a state-of-the-art planner by 30 times, including the time required for task scoping itself.
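The pruning idea in the abstract above can be illustrated with a much simpler technique, backward relevance analysis over STRIP-style actions. This sketch is not the task scoping algorithm from the paper: it ignores the initial condition, gives none of the paper's guarantees beyond plain goal reachability, and every name in it (`Action`, `relevant_actions`, the Minecraft-like facts) is hypothetical. It only shows how actions that cannot contribute to a goal might be dropped before planning.

```python
from typing import Dict, FrozenSet, NamedTuple, Set


class Action(NamedTuple):
    """Hypothetical propositional action: facts it needs and facts it achieves."""
    preconditions: FrozenSet[str]
    effects: FrozenSet[str]


def relevant_actions(actions: Dict[str, Action], goal: Set[str]) -> Set[str]:
    """Return names of actions that can contribute, directly or indirectly, to the goal.

    Starting from the goal facts, keep any action whose effects touch a
    relevant fact, then treat that action's preconditions as relevant too.
    Repeat until nothing changes (a fixed point).
    """
    relevant_facts = set(goal)
    kept: Set[str] = set()
    changed = True
    while changed:
        changed = False
        for name, action in actions.items():
            if name not in kept and action.effects & relevant_facts:
                kept.add(name)
                relevant_facts |= action.preconditions
                changed = True
    return kept


# Illustrative domain: only the ore-related chain matters for the goal.
actions = {
    "mine_ore": Action(frozenset({"has_pickaxe"}), frozenset({"has_ore"})),
    "craft_pickaxe": Action(frozenset({"has_wood"}), frozenset({"has_pickaxe"})),
    "chop_tree": Action(frozenset(), frozenset({"has_wood"})),
    "plant_flower": Action(frozenset({"has_seed"}), frozenset({"has_flower"})),
    "buy_seed": Action(frozenset(), frozenset({"has_seed"})),
}
kept = relevant_actions(actions, {"has_ore"})
print(sorted(kept))  # the flower-related actions are pruned
```

A planner given only the kept actions searches a smaller space, which is the intuition behind using such a step before planning.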

artificial intelligence, planning & scheduling, planning problem, (18 more...)

arXiv.org Artificial Intelligence

2010.08869

Country:

North America > United States > California > San Francisco County > San Francisco (0.14)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)

Genre: Research Report (0.84)

Industry: Leisure & Entertainment > Games > Computer Games (0.49)

Technology: Information Technology > Artificial Intelligence > Representation & Reasoning > Planning & Scheduling (1.00)
